Abstract

Maritime domain awareness is critical for protecting sea lanes, ports,
harbors, offshore structures and critical infrastructures against common
threats and illegal activities. Limited surveillance resources constrain
maritime domain awareness and compromise full security coverage at all times.
This situation calls for innovative intelligent systems for interactive
situation analysis to assist marine authorities and security personnel in their
routine surveillance operations. In this article, we propose a novel
situation analysis framework to analyze marine traffic data and differentiate
various scenarios of vessel engagement for the purpose of detecting anomalies
of interest for marine vessels that operate over some period of time
in relative proximity to each other. The proposed framework views vessel
behavior as probabilistic processes and uses machine learning to model
common vessel interaction patterns. We represent patterns of interest
as left-to-right Hidden Markov Models and classify such patterns using
Support Vector Machines.


1 Introduction 

We consider marine traffic scenarios where two or more boats or ships engage in 
some form of coordinated interaction while operating over some period of time 
in relative proximity to each other. Maritime surveillance data is interpreted as 
sequences of discrete observations (e.g., position, heading, speed) over time that 
render vessel trajectories. We model, analyze and differentiate vessel interaction 
scenarios based on a combination of kinematic, geospatial and other features for 
the purpose of detecting anomalies of interest in near real time. We generalize 
here the classical concept of rendezvous to encompass any number and type of 
marine vessels, including all ships and boats regardless of their size and function, 
as well as offshore structures like oil and gas rigs, and to any kind of interaction 
between vessels that operate in relative proximity to each other or an offshore 
structure over some period of time. Beyond kinematic features (e.g., speed), we 
also analyze geospatial aspects (e.g., operating in restricted areas), geometric 
aspects (e.g., proximity to other vessels), contextual information (e.g., vessel 
size and cargo type) and domain knowledge (e.g., possible impact of a scenario) 
to provide a more holistic picture of a situation as it unfolds. 

2 The Proposed Framework

The classification and detection process is divided into three separate yet
logically connected phases: Engagement Detection, Scenario Detection, and
Anomaly Detection (see Figure 1).



Figure 1: Overview of the Proposed Framework 


2.1 Engagement Detection 

The purpose of this phase, the first of the three, is to identify
vessels that are within a certain proximity range of each other and determine
with high confidence whether they are actually engaging in an interaction
scenario. The challenge is to monitor this situation in real time on a continuous
basis for hundreds, even thousands, of moving marine vessels. However, many
vessels operate at some distance from each other and are unlikely to get close
enough within a foreseeable time period of a few minutes. Thus, a
proximity-sensitive approach, limiting observable interactions to a proximity
range S, is promising.

Step 1) Cluster Vessels: The first step of this phase identifies vessels that are 
within a close enough distance to each other. For clustering we use the popular 
DBSCAN [§] algorithm as a density-based clustering approach with parameters 
epsilon = S and minPoints = 2. 
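
The clustering step can be sketched as follows. This is a minimal
illustration, not the authors' implementation. It assumes that minPoints
counts the vessel itself, as in the original DBSCAN definition, so that with
minPoints = 2 every vessel with at least one neighbor within S is a core
point and the clusters coincide with the connected components of the
"within distance S" graph. Planar coordinates in metres and the function name
cluster_vessels are assumptions made for the example.

```python
from itertools import combinations
from math import hypot


def cluster_vessels(positions, s):
    """Group vessels that are chained together by gaps of at most s.

    positions: dict mapping a vessel id to planar (x, y) coordinates in metres.
    Returns a list of clusters (sets of ids); isolated vessels are treated
    as noise and omitted.
    """
    parent = {v: v for v in positions}

    def find(v):
        # Follow parent links to the representative, halving the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Merge the components of every pair of vessels that lie within s.
    for a, b in combinations(positions, 2):
        (xa, ya), (xb, yb) = positions[a], positions[b]
        if hypot(xa - xb, ya - yb) <= s:
            parent[find(a)] = find(b)

    groups = {}
    for v in positions:
        groups.setdefault(find(v), set()).add(v)
    return [g for g in groups.values() if len(g) >= 2]


positions = {"a": (0, 0), "b": (300, 0), "c": (600, 0), "d": (5000, 5000)}
clusters = cluster_vessels(positions, s=400)
```

The pairwise loop is quadratic in the number of vessels, so monitoring
thousands of vessels would likely need a spatial index or an existing DBSCAN
implementation instead.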

Step 2) Detect Candidates: The second step evaluates, for all pairs of vessels
belonging to the same cluster, conditions that indicate engagement based on
kinematic features. We have identified a number of conditions that indicate
engagement of two vessels in an interaction scenario, in addition to the one
exemplified in the following:

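\[
\mathrm{Engaging}(v_i, v_j : \mathit{Vessel}) \equiv
\mathrm{SoG}(v_i) < \theta_{v_i} \wedge \mathrm{SoG}(v_j) < \theta_{v_j} \wedge
\bigl(\mathrm{Proximity}(v_i, v_j, S') \vee \mathrm{Converging}(v_i, v_j, S', r)\bigr)
\]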

The outcome of the Engagement Detection phase is a set of candidates: pairs of
vessels that are close enough and possibly engaging in an interaction scenario.
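
A condition of this kind (both vessels slower than a per-vessel speed
threshold, and either already close or converging) can be sketched as below.
The text does not define Proximity or Converging, so both definitions here are
assumptions: Proximity is taken as "current distance at most S'", and
Converging as "a constant-velocity extrapolation brings the vessels within S'
during the next r seconds". The class VesselState, the planar coordinates and
the units are likewise hypothetical.

```python
from dataclasses import dataclass
from math import cos, hypot, radians, sin


@dataclass
class VesselState:
    x: float    # easting in metres (assumed planar coordinates)
    y: float    # northing in metres
    sog: float  # speed over ground in m/s
    cog: float  # course over ground in degrees, clockwise from north


def proximity(a, b, s_prime):
    """Assumed definition: the vessels are currently within s_prime metres."""
    return hypot(a.x - b.x, a.y - b.y) <= s_prime


def converging(a, b, s_prime, r):
    """Assumed definition: straight-line motion brings the vessels within
    s_prime metres at some time in the next r seconds."""
    dx, dy = b.x - a.x, b.y - a.y
    dvx = b.sog * sin(radians(b.cog)) - a.sog * sin(radians(a.cog))
    dvy = b.sog * cos(radians(b.cog)) - a.sog * cos(radians(a.cog))
    speed_sq = dvx * dvx + dvy * dvy
    # Time of closest approach, clamped to the look-ahead window [0, r].
    t = 0.0 if speed_sq == 0 else min(max(-(dx * dvx + dy * dvy) / speed_sq, 0.0), r)
    return hypot(dx + dvx * t, dy + dvy * t) <= s_prime


def engaging(a, b, theta_a, theta_b, s_prime, r):
    """Both vessels are slow, and they are either close or converging."""
    slow = a.sog < theta_a and b.sog < theta_b
    return slow and (proximity(a, b, s_prime) or converging(a, b, s_prime, r))


# Two slow vessels 1 km apart, heading towards each other.
east = VesselState(x=0.0, y=0.0, sog=2.0, cog=90.0)
west = VesselState(x=1000.0, y=0.0, sog=2.0, cog=270.0)
```

With a 300 s look-ahead the two example vessels meet within the window, so
the pair would become a candidate; with a 60 s look-ahead they are still about
760 m apart and would not.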

2.2 Scenario Detection 

In the Scenario Detection phase, the outcome of the Engagement Detection
phase is used to analyze candidate trajectories. Hidden Markov Models (HMMs)
[1], along with Support Vector Machines (SVMs) [2] as multi-class classifiers,
are used to classify and detect, with notable likelihood, scenarios of interest.

Step 3) Extract Kinematic Features: The observable entities of this step are
candidate pairs of vessels whose combined behavior is to be analyzed.
This step extracts the movement features of candidates, which are either
atomic or composite. An atomic feature refers to a single characteristic
of an individual vessel v in its corresponding trajectory at a certain time t,
such as SoG(v, t), while a composite feature refers to the composition of the
same atomic feature of two different vessels v_i and v_j at time t, such as
ΔSoG(v_i, v_j, t) = SoG(v_i, t) - SoG(v_j, t) or Distance(v_i, v_j, t).
Table 1 lists the atomic and composite features used here.


Table 1: Atomic and Composite Kinematic Features 



Type       Feature   Description
Atomic     SoG       speed over ground of a vessel
           CoG       course over ground of a vessel
           RoT       rate of turn of a vessel
Composite  Distance  distance between two vessels
           ΔSoG      difference in SoG of two vessels
           ΔCoG      difference in CoG of two vessels
           ΔRoT      difference in RoT of two vessels
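
The composite features of Table 1 can be computed from two simultaneous
vessel reports roughly as follows. This is a sketch under assumptions that
the text does not state: positions are planar coordinates in metres, and the
course difference is wrapped into [-180, 180) degrees so that courses of 350
and 10 degrees count as 20 degrees apart. The names Observation, wrap_degrees
and composite_features are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Observation:
    x: float    # easting in metres (assumed planar coordinates)
    y: float    # northing in metres
    sog: float  # speed over ground
    cog: float  # course over ground in degrees
    rot: float  # rate of turn


def wrap_degrees(angle):
    """Map an angle difference into [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def composite_features(a, b):
    """Composite features of two vessels observed at the same time t."""
    return {
        "Distance": hypot(a.x - b.x, a.y - b.y),
        "dSoG": a.sog - b.sog,
        "dCoG": wrap_degrees(a.cog - b.cog),  # wrapping is an assumption
        "dRoT": a.rot - b.rot,
    }


features = composite_features(
    Observation(x=0.0, y=0.0, sog=10.0, cog=350.0, rot=1.0),
    Observation(x=3.0, y=4.0, sog=7.0, cog=10.0, rot=-1.0),
)
```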


Step 4) Generate Combined Observations: This step generates a collection
of n combined observations for each candidate. A combined observation
of two vessels v_i and v_j, O(v_i, v_j, t, t'), refers to a sequence of
observation points o(v_i, v_j, t), ..., o(v_i, v_j, t') between time t and t',
all built from the same set of features. Each observation point consists of a
number of atomic or composite features of vessels v_i and v_j at a certain
time t, such as [ΔSoG(v_i, v_j, t), SoG(v_i, t)].
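
A combined observation can be assembled along these lines. The sketch assumes
integer time steps and tracks stored as dictionaries, and it skips time steps
at which either vessel has no report; none of these choices comes from the
text, and all function names are hypothetical.

```python
def combined_observation(track_i, track_j, features, t, t_end):
    """Build O(v_i, v_j, t, t') as a list of observation points.

    track_i, track_j: dicts mapping an integer time step to that vessel's
        atomic features at that step, e.g. {"SoG": 7.5}.
    features: list of functions (state_i, state_j) -> float, applied in the
        same order at every time step.
    """
    sequence = []
    for step in range(t, t_end + 1):
        if step not in track_i or step not in track_j:
            continue  # assumption: skip steps where a report is missing
        state_i, state_j = track_i[step], track_j[step]
        sequence.append([f(state_i, state_j) for f in features])
    return sequence


def delta_sog(state_i, state_j):
    return state_i["SoG"] - state_j["SoG"]


def sog_i(state_i, state_j):
    return state_i["SoG"]


# Observation points of the form [dSoG(v_i, v_j, t), SoG(v_i, t)].
track_i = {0: {"SoG": 8.0}, 1: {"SoG": 6.0}, 2: {"SoG": 5.0}}
track_j = {0: {"SoG": 5.0}, 2: {"SoG": 5.0}}
observation = combined_observation(track_i, track_j, [delta_sog, sog_i], 0, 2)
```

Using a different feature list for each call yields the n different types of
combined observations for the same candidate pair.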

Step 5) Generate Observation Probabilities using Markov Models:
Given a number of different combined observations O_x and HMMs λ_y, we compute
P(O_x | λ_y), the probabilities of the combined observations given the models.
The number of combined observation types and HMMs to be used for the scenario
classification depends on the classification strategy.
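
For a discrete HMM, the probability of an observation sequence given a model
is usually computed with the forward algorithm. The sketch below shows that
computation on a hypothetical two-state left-to-right model; the parameter
values are made up for illustration, and the observations are assumed to be
already quantized into symbol indices, a step the text does not describe.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | model) for a discrete HMM via the forward algorithm.

    obs:   non-empty list of observation symbol indices
    start: start[i]    = P(first state is i)
    trans: trans[i][j] = P(next state is j | current state is i)
    emit:  emit[i][k]  = P(symbol k | state i)
    """
    n = len(start)
    # alpha[i] = P(observations so far, current state is i)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][symbol]
            for j in range(n)
        ]
    return sum(alpha)


# Hypothetical left-to-right model: state 0 may stay or advance, state 1 is
# absorbing, so the transition matrix is upper triangular.
start = [1.0, 0.0]
trans = [[0.6, 0.4],
         [0.0, 1.0]]
emit = [[0.9, 0.1],   # state 0 mostly emits symbol 0
        [0.2, 0.8]]   # state 1 mostly emits symbol 1

likelihood = forward_likelihood([0, 1], start, trans, emit)
```

For long trajectories the products underflow, so a practical implementation
would work with scaled or log-domain values.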

Step 6) Classify Scenario: Each of the m scenario classes is modeled by n
HMMs, each trained on one of the n different types of combined observations.
This results in a set of n × m HMMs, λ_{1,1}, ..., λ_{1,n}, ..., λ_{m,1}, ...,
λ_{m,n}. Multi-class SVMs are used to classify scenarios based on these n × m
probability values.
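
One way to read this step is that every candidate is turned into a single
feature vector for the SVM. The indexing below is an assumption: the first
index of each model is taken to be the scenario class and the second the
observation type, and each model is scored on the observation type it was
trained on.

```latex
\[
\mathbf{p} = \bigl(
  P(O_1 \mid \lambda_{1,1}), \ldots, P(O_n \mid \lambda_{1,n}),
  \;\ldots,\;
  P(O_1 \mid \lambda_{m,1}), \ldots, P(O_n \mid \lambda_{m,n})
\bigr) \in [0,1]^{nm}
\]
```

The multi-class SVM then maps this vector to one of the m scenario classes.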

It is noteworthy that this phase determines the scenario class solely on the basis
of four kinematic features and their combinations. Detecting both normal and
anomalous scenarios can significantly reduce the overall workload of human
operators. Although using kinematic features is necessary, this alone is not
sufficient to provide high-confidence feedback.

2.3 Anomaly Detection 

In the third phase, the outcome of Scenario Detection is linked to contextual
information and domain knowledge. This extends basic scenario detection
beyond “what you see” by observing kinematic features to also take into account
“what you know”. Intuitively, this phase is highly interactive, as domain expert
knowledge plays a crucial role and is invaluable for ‘connecting the dots’.

Step 7) Verify Contextual Information: One can define basic domain 
knowledge and contextual information of a scenario in terms of first-order logic 
clauses. Once Scenario Detection has identified the scenario class, the contextual 
information and background knowledge related to the scenario are considered 
for the reasoning process (e.g., resolution, answer set programming, etc.) to 
verify and possibly revise the class of the observed scenario. 

Step 8) Calculate Impact: Domain experts define an estimated impact value
(0 < impact < 1) for each scenario class, which provides a measure of the
estimated damage associated with a scenario.

Any conflict between the outcomes of the Scenario Detection and Anomaly
Detection phases may indicate an anomaly and needs to be reported. Ultimately,
the outcome of the third and final phase is a set of decision-making guidelines
and alerts for human operators based on the impact of the situation.
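
The conflict check and impact ranking can be illustrated with a deliberately
simple stand-in. Instead of first-order logic reasoning, the sketch uses plain
set membership: a detection conflicts with the contextual information if the
reported vessel types do not match the types expected for the detected class.
The expected types follow the five scenario classes used in the experimental
evaluation; the impact values and all names are hypothetical.

```python
# Vessel types expected for each scenario class (Class-A to Class-E).
EXPECTED_TYPES = {
    "A": ({"cargo", "tanker"}, {"towing", "tug"}),
    "B": ({"cargo", "tanker", "passenger"}, {"pilot"}),
    "C": ({"tanker"}, {"tanker"}),
    "D": ({"passenger"}, {"passenger"}),
    "E": ({"search and rescue"}, {"search and rescue"}),
}

# Hypothetical impact values; the text leaves these to domain experts.
IMPACT = {"A": 0.2, "B": 0.2, "C": 0.8, "D": 0.5, "E": 0.3}


def consistent(scenario_class, type_i, type_j):
    """True if the two reported types fit the detected class in either order."""
    first, second = EXPECTED_TYPES[scenario_class]
    return (type_i in first and type_j in second) or (
        type_j in first and type_i in second
    )


def alerts(detections):
    """Return conflicting detections, highest estimated impact first.

    detections: list of (pair_id, scenario_class, type_i, type_j) tuples.
    """
    conflicts = [d for d in detections if not consistent(d[1], d[2], d[3])]
    return sorted(conflicts, key=lambda d: IMPACT[d[1]], reverse=True)


detections = [
    ("p1", "A", "cargo", "tug"),       # matches Class-A, no alert
    ("p2", "C", "tanker", "fishing"),  # classified as C, but not two tankers
    ("p3", "B", "pilot", "fishing"),   # classified as B, but no large vessel
]
ranked = alerts(detections)
```

A rule-based reasoner would replace the consistent function, while the ranking
by impact would stay the same.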

3 Experimental Evaluation

We base our experimental evaluation on AIS data collected by the U.S. Coast
Guard¹ using onboard navigation safety devices that transmit vessel position,
speed, course, etc. The datasets for UTM zones 1 to 11 for all of 2009, which
cover almost the entire west coast of North America, are used for the
experimental analysis. We consider five classes of scenarios here; Table 2 shows
the number of occurrences of each class extracted from the AIS dataset.

Class-A) a cargo/tanker vessel and a towing/tug vessel; 

Class-B) a cargo/tanker/passenger vessel and a pilot vessel; 

Class-C) two tanker vessels; 

Class-D) two passenger vessels; 

Class-E) two search and rescue vessels. 

Table 2: Number of Occurrences Extracted for Each Scenario Class 


Scenario Class   A       B      C     D     E
#Occurrences     44188   2372   252   671   378


3.1 Experimental Design & Performance Analysis 

The samples of each scenario class are partitioned into 10 non-overlapping sets,
and we run 10-fold cross validation, using 90% of the samples for training
and 10% for testing in each fold. This way, it is guaranteed that the
proportions of scenario classes in all 10 training sets are (almost) equal and,
more importantly, that the training and testing sets are disjoint. We experimentally
evaluate the effectiveness of using HMMs and SVMs for scenario classification.
We use the open-source HMM Toolbox [I] and the open-source library LIBSVM 
[5] for training and testing. 
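
The per-class partitioning described above corresponds to stratified k-fold
cross validation. A minimal sketch, with hypothetical function names and a
fixed random seed chosen only for reproducibility:

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k disjoint folds with similar class mix.

    labels: list where labels[i] is the scenario class of sample i.
    Returns a list of k lists of sample indices.
    """
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin, so the per-class counts of any two
        # folds differ by at most one.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def train_test_splits(folds):
    """Yield (train, test) index lists, using each fold once as the test set."""
    for held_out, test in enumerate(folds):
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test


labels = ["A"] * 95 + ["B"] * 25
folds = stratified_folds(labels)
splits = list(train_test_splits(folds))
```

Each of the 10 splits trains on nine folds and tests on the remaining one, so
no sample is ever used for both training and testing within a split.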

Table 3: Confusion Matrix, and Precision and Recall of Each Individual Class 



                  Predicted Class
True Class        A        B        C        D        E
A             43592      473       31       60       32
B               688     1673        1        8        2
C                56        0      196        0        0
D                89       15        0      550       17
E                72       13        0       23      270

Precision    97.97%   76.95%   85.96%   85.80%   84.11%
Recall       98.65%   70.53%   77.78%   81.97%   71.43%
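
The precision and recall figures follow directly from the confusion matrix,
with rows as true classes and columns as predicted classes. The short sketch
below recomputes them from the values of Table 3; the function names are
chosen for the example.

```python
CLASSES = ["A", "B", "C", "D", "E"]

# Rows are true classes, columns are predicted classes (values from Table 3).
CONFUSION = [
    [43592, 473, 31, 60, 32],
    [688, 1673, 1, 8, 2],
    [56, 0, 196, 0, 0],
    [89, 15, 0, 550, 17],
    [72, 13, 0, 23, 270],
]


def precision(matrix, k):
    """Correct predictions of class k divided by all predictions of class k."""
    return matrix[k][k] / sum(row[k] for row in matrix)


def recall(matrix, k):
    """Correct predictions of class k divided by all true samples of class k."""
    return matrix[k][k] / sum(matrix[k])


def accuracy(matrix):
    """Sum of the diagonal divided by the total number of samples."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


for k, name in enumerate(CLASSES):
    print(f"{name}: precision {100 * precision(CONFUSION, k):.2f}%, "
          f"recall {100 * recall(CONFUSION, k):.2f}%")
print(f"accuracy {100 * accuracy(CONFUSION):.1f}%")
```

The row sums equal the class counts in Table 2, and the overall accuracy comes
out at about 96.7%, dominated by the much larger Class-A.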


1 http://www.marinecadastre.gov 

4 Conclusions


Intelligent systems for interactive situation analysis and decision support in
real-world situation assessment require innovative methodical approaches to
economically develop robust and scalable solutions. We exemplify this idea here
for the maritime domain, proposing a generalized model for analyzing vessel
interaction patterns, where each pattern refers to a family of multi-vessel
scenarios with common kinematic and typically also non-kinematic characteristics.
This analytic approach is novel and combines qualitative with quantitative modeling
aspects into a three-phase classification and detection framework serving two
important practical purposes: 1) reducing the overall volume of observation data to
be monitored and analyzed by coast guard services and marine authorities; and
2) providing a prioritized list of highly suspicious and critical scenarios ranked
according to the anticipated impact, which then calls for human attention. In
fact, each of the three phases serves as a filter to reduce the volume to be
processed by the subsequent phase, effectively performing a closer inspection of
increasingly more relevant threats to safety and security.

In abstract terms, filtering is done for all vessels that engage in some form
of observable interaction through: 1) fusing the kinematic features of these
vessels into a uniformly analyzable entity; 2) fusing contextual information related
to these vessels for a deeper analysis of the entity. The experimental results
show that the accuracy of our proposed approach, which uses a collection of HMMs
together with SVMs, is 96.7%, making this framework promising for practical
use in real-world situation analysis for maritime domain awareness. Although
the experiments presented here focus on the west coast of North America, we
expect the same framework to produce similar results for any coastal
waters of North America and beyond. In our future work we will extend the
experiments to include the East Coast and the Gulf of Mexico.